615 research outputs found

    An effective, low-cost measure of semantic relatedness obtained from Wikipedia links

    This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide structured world knowledge about the terms of interest. Our approach is unique in that it does so using the hyperlink structure of Wikipedia rather than its category hierarchy or textual content. Evaluation with manually defined measures of semantic relatedness reveals this to be an effective compromise between the ease of computation of the former approach and the accuracy of the latter.

    A competitive environment for exploratory query expansion

    Most information workers query digital libraries many times a day. Yet people have little opportunity to hone their skills in a controlled environment, or compare their performance with others in an objective way. Conversely, although search engine logs record how users evolve queries, they lack crucial information about the user's intent. This paper describes an environment for exploratory query expansion that pits users against each other and lets them compete, and practice, in their own time and on their own workstation. The system captures query evolution behavior on predetermined information-seeking tasks. It is publicly available, and the code is open source so that others can set up their own competitive environments.

    Extracting corpus specific knowledge bases from Wikipedia

    Thesauri are useful knowledge structures for assisting information retrieval. Yet their production is labor-intensive, and few domains have comprehensive thesauri that cover domain-specific concepts and contemporary usage. One approach, which has been attempted without much success for decades, is to seek statistical natural language processing algorithms that work on free text. Instead, we propose to replace costly professional indexers with thousands of dedicated amateur volunteers--namely, those who are producing Wikipedia. This vast, open encyclopedia represents a rich tapestry of topics and semantics and a huge investment of human effort and judgment. We show how this can be directly exploited to provide WikiSauri: manually-defined yet inexpensive thesaurus structures that are specifically tailored to expose the topics, terminology and semantics of individual document collections. We also offer concrete evidence of the effectiveness of WikiSauri for assisting information retrieval.

    Mining Domain-Specific Thesauri from Wikipedia: A case study

    Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show how the classic thesaurus structure of terms and links can be mined automatically from Wikipedia. In a comparison with a professional thesaurus for agriculture, we find that Wikipedia contains a substantial proportion of its concepts and semantic relations; furthermore, it has impressive coverage of contemporary documents in the domain. Thesauri derived using our techniques capitalize on existing public efforts and tend to reflect contemporary language usage better than their costly, painstakingly-constructed manual counterparts.

    Applying Wikipedia to Interactive Information Retrieval

    There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases (e.g. controlled vocabularies, classification schemes, thesauri and ontologies) to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday web-scale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.

    Clustering documents with active learning using Wikipedia

    Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this paper we propose to exploit the semantic knowledge in Wikipedia for clustering, enabling the automatic grouping of documents with similar themes. Although clustering is intrinsically unsupervised, recent research has shown that incorporating supervision improves clustering performance, even when limited supervision is provided. The approach presented in this paper applies supervision using active learning. We first utilize Wikipedia to create a concept-based representation of a text document, with each concept associated with a Wikipedia article. We then exploit the semantic relatedness between Wikipedia concepts to find pair-wise instance-level constraints for supervised clustering, guiding clustering towards the direction indicated by the constraints. We test our approach on three standard text document datasets. Empirical results show that our basic document representation strategy yields comparable performance to previous attempts, and adding constraints improves clustering performance further by up to 20%.

    Chabauty-Coleman experiments for genus 3 hyperelliptic curves

    We describe a computation of rational points on genus 3 hyperelliptic curves $C$ defined over $\mathbb{Q}$ whose Jacobians have Mordell-Weil rank 1. Using the method of Chabauty and Coleman, we present and implement an algorithm in Sage to compute the zero locus of two Coleman integrals and analyze the finite set of points cut out by the vanishing of these integrals. We run the algorithm on approximately 17,000 curves from a forthcoming database of genus 3 hyperelliptic curves and discuss some interesting examples where the zero set includes global points not found in $C(\mathbb{Q})$.

    Using different Facebook advertisements to recruit men for an online mental health study: Engagement and selection bias.

    A growing number of researchers are using Facebook to recruit for a range of online health, medical, and psychosocial studies. There is limited research on the representativeness of participants recruited from Facebook, and the content is rarely mentioned in the methods, despite some suggestion that the advertisement content affects recruitment success. This study explores the impact of different Facebook advertisement content for the same study on recruitment rate, engagement, and participant characteristics. Five Facebook advertisement sets ("resilience", "happiness", "strength", "mental fitness", and "mental health") were used to recruit male participants to an online mental health study which allowed them to find out about their mental health and wellbeing through completing six measures. The Facebook advertisements recruited 372 men to the study over a one-month period. The cost per participant from the advertisement sets ranged from 0.55 to 3.85 Australian dollars. The "strength" advertisements resulted in the highest recruitment rate, but participants from this group were least engaged in the study website. The "strength" and "happiness" advertisements recruited more younger men. Participants recruited from the "mental health" advertisements had worse outcomes on the clinical measures of distress, wellbeing, strength, and stress. This study confirmed that different Facebook advertisement content leads to different recruitment rates and engagement with a study. Different advertisement content also leads to selection bias in terms of demographic and mental health characteristics. Researchers should carefully consider the content of social media advertisements to be in accordance with their target population and consider reporting this to enable better assessment of generalisability.

    Effects of gamma irradiation on the biomechanical properties of peroneus tendons

    PURPOSE: This study was designed to investigate the biomechanical properties of nonirradiated (NI) and irradiated (IR) peroneus tendons to determine if they would be suitable allografts, with regard to biomechanical properties, for anterior cruciate ligament reconstruction after a dose of 1.5–2.5 Mrad. METHODS: Seven pairs of peroneus longus (PL) and ten pairs of peroneus brevis (PB) tendons were procured from human cadavers. The diameter of each allograft was measured. The left side of each allograft was IR at 1.5–2.5 Mrad, whereas the right side was kept aseptic and NI. The allografts were thawed, kept wet with saline, and attached in a single-strand fashion to custom freeze grips using liquid nitrogen. A preload of 10 N was then applied and, after it had reached steady state, the allografts were pulled at 4 cm/sec. The parameters recorded were the displacement and force. RESULTS: The elongation at the peak load was 10.3±2.3 mm for the PB NI side and 13.5±3.3 mm for the PB IR side. The elongation at the peak load was 17.4±5.3 mm for the PL NI side and 16.3±2.0 mm for the PL IR side. For PL, the ultimate load was 2,091.6±148.7 N for NI and 2,122.8±380.0 N for IR. The ultimate load for the PB tendons was 1,485.7±209.3 N for NI and 1,318.4±296.9 N for the IR group. The ultimate stress calculations for PL were 90.3±11.3 MPa for NI and 94.8±21.0 MPa for IR. For the PB, the ultimate stress was 82.4±19.0 MPa for NI and 72.5±16.6 MPa for the IR group. The structural stiffness was 216.1±59.0 N/mm for the NI PL and 195.7±51.4 N/mm for the IR side. None of these measures were significantly different between the NI and IR groups. The structural stiffness was 232.1±45.7 N/mm for the NI PB and 161.9±74.0 N/mm for the IR side, and this was the only statistically significant difference found in this study (P=0.034). CONCLUSION: Our statistical comparisons found no significant differences in terms of elongation, ultimate load, or ultimate stress between IR and NI PB and PL tendons. Only the PB structural stiffness was affected by irradiation. Thus, sterilizing allografts at 1.5–2.5 Mrad of gamma irradiation does not cause major alterations in the tendons’ biomechanical properties while still providing a suitable amount of sterilization for anterior cruciate ligament reconstruction.

    Right Temporoparietal Gray Matter Predicts Accuracy of Social Perception in the Autism Spectrum

    Individuals with an autism spectrum disorder (ASD) show hallmark deficits in social perception. These difficulties might also reflect fundamental deficits in integrating visual signals. We contrasted predictions of a social perception and a spatial–temporal integration deficit account. Participants with ASD and matched controls performed two tasks: the first required spatiotemporal integration of global motion signals without social meaning, the second required processing of socially relevant local motion. The ASD group differed from controls only in social motion evaluation. In addition, gray matter volume in the temporal–parietal junction correlated positively with accuracy in social motion perception in the ASD group. Our findings suggest that social–perceptual difficulties in ASD cannot be reduced to deficits in spatial–temporal integration.